Analyzing the Popular Words to Evaluate Spam in Arabic Web Pages

Authors

  • Heider A. Wahsheh
  • Izzat M. Alsmadi
  • Mohammed N. Al-Kabi
Abstract

The rapid expansion and widespread use of the Web and the Internet have also attracted intruders who exploit the Web for destructive purposes. Within Websites and Web pages, spammers try to inject their own content and pages into Websites and search engine results so that they become more visible to users and draw those users to their Websites or products. This paper analyses the behavior of spammers in content-based Arabic Web pages by analyzing the weights of the ten most popular Arabic words used by Arab users in their queries. The results show that spammer behavior in Arabic Web pages can be unique and distinguishable compared with other languages. A Decision Tree was used to evaluate this behavior, and it achieved an accuracy of 90%.

General Terms: Security, Legal Aspects.
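The idea of weighting popular query words in a page can be illustrated with a minimal sketch. It assumes that a word's "weight" is its share of all tokens in the page, which the abstract does not confirm. The word list, the function names `word_weights` and `looks_stuffed`, and the threshold are hypothetical placeholders, not values from the paper. The single threshold test only stands in for one split of a decision tree; it is not the classifier the authors trained.

```python
from collections import Counter

# Hypothetical placeholder list: the abstract does not name the paper's
# ten popular Arabic query words.
POPULAR_WORDS = ["صور", "تحميل", "العاب"]


def word_weights(text, popular_words=POPULAR_WORDS):
    """Return each popular word's share of all whitespace-separated tokens."""
    tokens = text.split()
    total = len(tokens)
    if total == 0:
        return {word: 0.0 for word in popular_words}
    counts = Counter(tokens)
    return {word: counts[word] / total for word in popular_words}


def looks_stuffed(weights, threshold=0.2):
    """Flag a page when any popular word exceeds the threshold.

    This is a single threshold test, comparable to one decision-tree split.
    The default threshold is an arbitrary illustrative value.
    """
    return max(weights.values()) > threshold
```

A page that repeats one popular word far more often than ordinary text would is flagged by the rule, while a page that uses none of the listed words is not. In practice the per-word weights would serve as input features for a trained classifier rather than for a hand-set rule.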

Similar resources

Analysis of Web Spam for Non-English Content: Toward More Effective Language-Based Classifiers

Web spammers aim to obtain higher ranks for their web pages by including spam contents that deceive search engines in order to include their pages in search results even when they are not related to the search terms. Search engines continue to develop new web spam detection mechanisms, but spammers also aim to improve their tools to evade detection. In this study, we first explore the effect of...

OLAWSDS: An Online Arabic Web Spam Detection System

For marketing purposes, some Website designers and administrators use illegal Search Engine Optimization (SEO) techniques to optimize the ranking of their Web pages and mislead the search engines. Some Arabic Web pages use both content and link features to artificially increase the rank of their Web pages in the Search Engine Results Pages (SERPs). This study represents an enhancement to prev...

Analyzing new features of infected web content in detection of malicious web pages

Recent improvements in web standards and technologies enable attackers to hide and obfuscate malicious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...

Fast Asynchronous Anti-TrustRank for Web Spam Detection

Web spam detection is an important problem in Web search. Since Web spam pages tend to have many spurious links, many Web spam detection algorithms exploit the hyperlink structure between Web pages to detect the spam pages. The Anti-TrustRank algorithm is a well-known link-based spam detection algorithm which follows the principle that spam pages are likely to be referenced by other spam pa...

DSpin: Detecting Automatically Spun Content on the Web

Web spam is an abusive search engine optimization technique that artificially boosts the search result rank of pages promoted in the spam content. A popular form of Web spam today relies upon automated spinning to avoid duplicate detection. Spinning replaces words or phrases in an input article to create new versions with vaguely similar meaning but sufficiently different appearance to avoid pl...

Publication date: 2012